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This is the second paper in a series describing a numerical implementa- 
tion of the conformal Einstein equation. This paper deals with the technical 
details of the numerical code used to perform numerical time evolutions from 
a "minimal" set of data. 
We outline the numerical construction of a complete set of data for our equa- 
qq ' tions from a minimal set of data. The second and the fourth order discreti- 

sations, which are used for the construction of the complete data set and for 
£^ ■ the numerical integration of the time evolution equations, are described and 

, their efficiencies are compared. By using the fourth order scheme we reduce 

our computer resource requirements — with respect to memory as well as 
computation time — by at least two orders of magnitude as compared to the 
second order scheme. 



On 



I. INTRODUCTION 

X 

We calculate solutions to the Einstein equation arising from hyperboloidal initial data 
by solving the conformal field equation. In the first paper in this series |3J the ideas behind 
the conformal approach, their mathematical foundation, and the benefits of it have been 
discussed in detail. In this paper we discuss the numerical details of the part of the imple- 
mentation which calculates the time evolution from a minimal set of data consisting of the 
conformal 3-metric and the conformal extrinsic curvature as well as the conformal 
factor Q and its time derivative fl . The latter relate (h ab , k a b) to the physical data. 
To be able to test the numerics described here, we take known exact solutions given in 
terms of the conformal metric g a t, and the conformal factor Q, perform numerical coordi- 
nate transformations to hide obvious symmetries, and calculate in a straightforward way a 
minimal set of data from the transformed solution. Then we extend the minimal set of data 
to a complete set of conformal data. The calculation of the complete data set is a delicate 
numerical issue. In section || we discuss why it is so and we show how to solve this problem. 
After having calculated a complete set of data we can start the actual time integration. In 



section [II we describe in detail the second and the fourth order scheme used to numerically 



integrate the symmetric hyperbolic time evolution equations. 



In the section [TV] we outline properties of the exact solutions important for the tests, the 
tests themselves, and the results of the comparisons between the two schemes. 
In the tests based on conformal Minkowski data we reconstruct the whole future of the initial 
slice as well as null infinity and even timelike infinity after a finite number of time steps. 
Obviously we can then stop the calculation. As suggested by the theorems in || we can 
expect to cover the future of the initial slice in all cases of sufficiently small asymptotically 
Minkowskian data, i. e. for all gravitational wave data not forming singularities or black 
holes, by one finite grid. 

In the other tests we use asymptotically A3 solutions (cf. subsection |1V A| ). These are solu- 
tions which in general contain gravitational radiation but are special in the sense that they 
possess a conformal structure which becomes singular towards timelike infinity. We compare 
the numerical result with the analytic solution after covering a certain integration time. To 
check stability we have also performed much larger numbers of time steps and we did not 
encounter any numerical problems, but the numerical evolution slows down for the following 
reason. In the given coordinates, the light cones in the interior become flatter and flatter the 
closer they are to the singular timelike infinity. Therefore, the Courant-Friedrichs-Levy con- 
dition must enforce smaller and smaller time steps and it prevents us from reaching timelike 
infinity in the numerical integration, unless we embark upon a more detailed study (which 
is beyond the goal of this paper). 

Comparing the second order with the fourth order scheme we found the fourth order scheme 
surpassing the second order scheme in three dimensional calculations by an at least two 
orders of magnitude more efficient use of computer resources. In all cases we tested the 
combination of the fourth order scheme with grids with 100 gridpoints in each space di- 
mensions was sufficient to achieve relative errors of less than one percent. The resulting 
moderate memory and computer time requirements allow us to do our calculations, includ- 
ing the ones with three space dimensions, on medium size parallel computers which are 
nowadays available in many physics institutes with numerical orientation. 
In appendix |A] we give a short review of the computational science aspects of the code. 



II. CONSTRUCTING A COMPLETE DATA SET 

To avoid too many repetitions, we assume the reader to be familiar with the general 
approach as well as with the equations discussed in part one of this series, and we shall refer 
to equation (n) of part one by writing (I/n). 



A. Minimal set of data from exact solutions 

The known solutions of the conformal field equations have all high degrees of symmetries, 
and usually the metric g a 'b' and the rescaling factor Q are written in terms of symmetry 
adapted coordinates x s a = (t s , x s , y s , z s ). In these coordinates many variables and major 
parts of the equations are identically zero, hence they are of limited use for testing. Things 
become more interesting if we make coordinate transformations which hide the symmetries. 
For technical reason we prescribe the inverse of the coordinate transformation, 

x/ = x s a \x a ). (1) 
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Then the components of the metric g a b in the new coordinates g a b are 

g ab (x a ) = g a , v {xs a '{x a )){d a x s a '){d h x s b '). (2) 

The derivatives are numerically calculated by evaluating fourth order stencils according 
to ( |10a| ) with respect to an imaginary grid with one tenth of the grid spacing of the real 
grid. From equation (^|) we can read off the first elements of a minimal set of data on 
the hypersurface defined by t = t Q , namely the components of the 3-metric h a b, as the 
{a;,2/,z}-components of g a b- The next element of a minimal set, the scalar Q, only changes 
its functional form under the coordinate transformation, Q(x a ) = Q(x s a (x a )). 
After numerically calculating time and space derivatives of (h a b,Q), equation (I/13a) respec- 
tively (I/13h) are used to calculate the missing elements (k a b,Qo) of a minimal set of data. 
The needed lapse A" and shift N a are obtained as the solution of the system of equations 
g tt = —N 2 + h ab N a N b and g {XjV>z}t = h {x>yjZ}b N b . 

During the time evolution we have to prescribe the gauge source functions q = \n(N/Vh), 
shift A^ a , and Ricci scalar R. The function q = \n(N/y/h) and the shift A^ a are calculated 
from the coefficients of g a b on each slice in analogy to the approach on the initial slice, the 
Ricci scalar R is given as function of the coordinates. 



B. Prom a minimal to a complete set of conformal data 

To calculate the remaining initial data (7 a & c , (0,1) i?a, ^Rab, E a b, B a b, Q a , oS) from the 
minimal set we use certain combinations of the conformal constraints (1/14). In particular, 
we use 

A/"ha6c = 0, (3a) 

A4a6 6 = 0, (3b) 

A/ 7afe a6 = 0, (3c) 

A/ na = 0, (3d) 

A/n a a a = 0, (3e) 

A/n aafe = 0, (3f) 

(3) ea cd A/k cdb = 0, (3g) 

to solve for 7 a 6c , (0 ' 1) ^ a , (1,1) -R a a , u, fa.i)R ab ■= ^ (1 ' 1] R a b, fsab ■= QB ab respectively. 
To extract (1 ' 1} R a b and B ab from /(i,i^ afe and fs a b we have to divide by Q. This division by Q 
is numerically the most delicate part of the construction of a complete set of data. It will be 
described in the next subsection. After the division we use (1,1) R a b to calculate fEab '■= QE a b 
from 

A/; acb c = 0. (3h) 
To determine E a b from fEab we again have to divide by Q. 
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C. Dividing by U 



In the smoothness of the limits lim^o q nas been analysed in detail. In a straight- 
forward implementation of the approach used there we would divide by Q outside the set S 
on the initial slice and use rHopital's rule at S, where Q vanishes. Then a picture like the 
one shown in figure [l] would result. 




x-x- x 
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FIG. 1. Sketch of C 1 - 1 ^ 
(solid line) and E a b (dashed 
line) near S as obtained by a 
combination of a division by f2 
outside S and the application 
of l'Hopital's rule at S. The 
crosses denote gridpoints. 
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gridpoint number, location of 5 1 is 9 and 41 



FIG. 2. Result of our method applied to an A3 
spacetime on a 50 3 grid. The solid line shows En + 2.5 
along a typical (y, z) = const line, which intersects S at 
gridpoints 9 and 41. The dashed line shows 100 x the 
difference to an 100 3 calculation. Its non-smoothness 
near the boundary is caused by changing from a sym- 
metric to an asymmetric stencil. It does not affect the 



boundary treatment described in subsection HI C 



The reason for the pole like structure near S is simple, as we will immediately see. Let 
us assume, for simplicity, that we have one space dimension only, that Q = at x = 0, 
and that the corresponding gridpoint has number 0. In the process of calculating 
IbJe) numerical derivatives are taken, therefore each / deviates from the exact value f e 
by a discretisation error, / = f e + af(Ax) n , where n is the order of the scheme. We now 
make a Taylor expansion of f e and Q around x = + dx and recognise that we lost near 
point one order of accuracy, since ^ = $f + d a ^ -ff (Ax)™" 1 and ^ is always of order 1 

at the neighbours of gridpoint 0. 4r changes sign at point 0, hence the form of the solid 
line in figure [l]. When dividing by Q a second time to calculate E a b we see by the same kind 
of argument and by recalling, that the pole like structure of the solid line enters into fEab, 
that we get the dashed line in figure 0. 

This non-smoothness of the discretisation error and the loss of two orders of accuracy in 
the initial data is unacceptable mainly for two reasons. Firstly, discretisation schemes for 
symmetric hyperbolic equations tend to react to these kinds of non-smoothness in the initial 
data with significant drops in the convergence order. Secondly, this defect would be most 
significant in the neighbourhood of S and would hence establish a kind of boundary at S 
which would eliminate many advantages of the conformal method. 

We tried various methods to remove the pole-like structure by some kind of smoothing 
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procedure. Although the behaviour could be improved, the phenomena never disappeared 
completely. The problematic behaviour could almost be cured by first subtracting a function 
with values of the order of the discretisation error from / on the whole grid, making / 
exactly vanish at S, and then dividing by Q.f] We did not pursue this approach beyond the 
asymptotically A3 scenarios because we did not see how to generalise it to arbitrary S and 
in particular because we found the following approach which works for arbitrary S. 
Instead of solving the equation gVt = f for g, we determine g as solution to an elliptic 
equation of the type 

^A(Q 2 g - Qf) = 0, (4) 

where the symbol (3) A denotes the Laplace operator of (3 V a , that is (3) A := (3 V a (3 V a . If we 
write this as an equation for u := Q 2 g — Qf, we see that the uniquely determined solution 
for the boundary values u | boundary = is u = 0, i. e. g = f /Q. 
Written as equation for g equation (|J) reads 

fl 2 {3 Ag + 4ft ( (3 V a ft) ( (3 V a #) + [2 ( (3 V a ft) ( (3 V a ft) + 2ft (3 Aft] g = 

f (3 Aft + 2 ( (3 V a ft) ( (3 V a /) + ft (3 A/. (5) 

If the derivatives are discretised by symmetric stencils like (|9|) or (|K]), the ( (3 V n g) term 
has a sign which is known to cause a tendency for instabilities ||. To avoid the resulting 
numerical problems we add 

-rK (3 V a ft)( (3 V a (^-/)) = (6) 

to equation (|5|) and obtain 

ft 2 (3) A g + (4 - n)n ( (3 v a ft) ( (3 v a #) + [(2 - v ) ( (3 v a ft) ( (3 v a ft) + 2^ (3 A^] g = 

f (3 Aft + (2 - 77) ( (3 V a ft) ( (3 V a /) + n (3 A/. (7) 

This is the linear elliptic equation we solve after we have numerically calculated the function 
/, which is due to discretisation errors not exactly at S. Instead of having to divide by 
Q on the whole grid we only have to divide by Q on the trueQ grid boundaries to provide 
boundary values for our elliptic equation. This does not constitute any numerical problem, 
since Q is significantly different from at the true grid boundaries. 
Equation (|7|) tells us that on S 

9 (2 - 77) ( < 3 >V a ft) ( (3) V a ft) ( < 3 >V a ft) ( (3) V a ft) ' 1 ' 

which is a correct answer, since / vanishes up to the discretisation error. Since (3 V a ft 
in any extended hyperboloidal initial value problem, expression (^|) is well-defined. 



x J6rg Frauendiener has developed a sophisticated way to combine this kind of idea with spectral 
decomposition in his 2D code for asymptotically A3 spacetimes 0. 



2 For a definition of "true" in the context of boundaries see subsection III C. 
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Of course the properties of equation (0) depend on the parameter 77: For 77 = it is difficult 
to get a stable numerical scheme, for 77 = 2 there is no solution possible if / 0, for 77 = 3 
the discretised system can be written as symmetric matrix |6|, equation(5.1.18)], for 77 = 4 
the term ( (3) V a g) has vanishing coefficient, and for 77 = 8 the system possesses the same 
principal part as the Yamabe equation, which will play a crucial role in the next part of 
the series, where we describe how to generate minimal sets of data not representing known 
exact solutions. At least the later three choices for 77 deliver values of g which converge to 
the exact g with a smooth discretisation error and without loss of one order of convergence. 
Figure |2] shows the smoothness and the error for a run with a pretty coarse grid. 
The default choice in the code is 77 = 8, together with 5 a6 as 3-metric. To exclude the 
possibility of having calculated a spurious solution of equation (0), except for the 77 = case 
we have not shown uniqueness, we evaluate the constraints (1/14) to check consistency. 

To discretise we substitute 

d x f —> (fi+l,j,k — fi-l,j,k) ? (9 a ) 

9 x 2 f — > (Ai.* _ 2 Ai> fc + j (9b) 

and their equivalents for the y and z coordinates to obtain second order approximations, 
respectively 

9 x f — > ^~^ i+2 '^ k _ &fi-l,j,k + fi-2,j,k) , (10a) 

d x 2 f — > -^2(Ax) 2 (~~f i+2 <i> k ^fi+hhk ~ SOfijfi + — fi-2,j,k) , (10b) 

and their equivalents for the y and z coordinates to obtain fourth order approximations. 
The elliptic equation (0) then becomes a set of linear equations for the values gijk °f 9 a ^ 
gridpoints k), denoted by the vector u: 

Qu = r } (11) 

with some sparse matrix C depending on the stencils. The vector r is given by the prescribed 
boundary values and the right hand side of equation (|7|). The matrix C_ would become sin- 
gular if Q and d a Q vanished simultaneously at one gridpoint. For an extended hyperboloidal 
initial value problem this cannot happen. 

Equation ( |TTD is solved iteratively by using the algebraic multigrid library (AMG) by K. 
Stiiben from the Gesellschaft fur Mathematik und Datenverarbeitung. AMG analyses the 
algebraic structure of the matrix Q_ and derives from the structure a strategy to apply multi- 
grid techniques to accelerate the convergence rate of the iterative solution of equation (pT|) . 
Since the structure analysis happens automatically in AMG, it is very easy to program el- 
liptic solvers for different stencils or for grids with different topologies, one only needs to 
change the computation of C_ and r, but not the multigrid part. This convenience more than 
outweighs the computational overhead from analysing the algebraic structure of C_ once in 
each program run. The interested reader can find a detailed description of AMG in |7]||] . 
Before we end this section, we should make a remark about the accuracy which we can 
achieve. Increasing the number of gridpoints decreases the discretisation error. On the 
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other side, it is well-known that matrix inversion, here solving equation (|ll|), amplifies 
rounding errors. The amplification grows with the size of the matrix C_, which is determined 
by the number of gridpoints. Due to the amplification of the rounding errors refinement of 
the grid beyond a certain threshold will not improve the accuracy of the solution. When we 
surpass the 500 2 grid size in a fourth order scheme calculation, rounding errors become a 
visible contribution to the total error. The size of the relative error in the solutions of the 
elliptic solver is then of order 10 -11 . A remnant of that lower bound to the convergence of 
the scheme becomes visible in the curve for the 640 2 grid size in figure |[ 



III. THE DISCRETISATION OF THE TIME INTEGRATOR 



In this section, where we describe the discretisation of the time evolution equa- 
tions (1/13), we write our equations formally as 

d t f + AVW = b(f). (12) 

Our system of equation is, up to some simple algebraic manipulations, a quasilinear sym- 
metric hyperbolic system of first order. One of the characteristic properties of these kinds of 
systems is the existence of a maximal set of real characteristics and thus of a finite propaga- 
tion speed of signals. In implicit schemes the numerical speed of propagation is infinite, or 
at least very large. To mimic the finiteness of the propagation speed we therefore have to use 
explicit schemes. Moreover, we have to expect that parts of our slices run into singularities 
during time evolution. Due to the infinite numerical propagation speed in implicit schemes, 
the occurrence of a single gridpoint with singular values of a single variable would cease the 
calculation. In explicit schemes we can in principle continue the calculation to cover at least 
part of the remaining spacetime |9,10[]. For these reasons we have only evaluated explicit 
schemes. 



A. The second order scheme 

1. General considerations for the choice of the scheme 
The general form of explicit second order schemes in lDf\ is given by the so-called S% 



schemes ||1 1|| , which contain the widely used MacCormack and Lax-Wendroff schemes. Their 
extension to 2D and 3D is not unique and most of the schemes obtained distinguish cer- 
tain propagation directions. For many of the schemes with a distinguished direction, e. g. 
all generalisations of the MacCormack scheme, the occurrence of a weak instability could 
be shown already for the advection equation, if the propagation direction is opposite to 
the distinguished direction ||12||. This instability is of a strange character and hard to de- 



tect p2| , p]3[ . Since for quasilinear equations the propagation directions depend on the data 
and may change during the time evolution, those schemes, even if initially stable, may be- 
come unstable. 



'We call a calculation nD, if it uses an n-dimensional space grid. 
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To avoid distinguishing certain directions and dealing with these instabilities we implemented 
the rotated Richtmyer scheme, the extension of Lax-Wendroff which does not distinguish 
propagation directions. For our evolution equations combined with strong data even the ro- 
tated Richtmyer scheme turned out to be unstable. A grid mode with a wave length of ten 
gridpoints was not damped sufficiently Although we could make this grid mode vanish by 
adding artificial viscosity, the scheme was still not stable, a 20 gridpoint grid mode appeared 
later in the time evolution. In the linearised equations the growth rate of the instability was 
significantly weaker, the linearised equations without source were stable, as predicted by the 
theory. To be able to treat the principle part, which determines the propagation directions, 
by rotated Richtmyer and the sources by something else, we use Strang splitting. 

2. The implemented second order scheme 



In the Strang splitting ansatz one formally writes equation (|T2|) as an equation for the 
principal part, 

dJ + A\mi = 0, (13) 

and an equation for the sources, 

dj = h(f). (14) 

To integrate equation (|H|) we use the rotated Richtmyer scheme: 
First, we calculate a half grid, 

A = J_ ( f l , r l , A , A 

Li+l/2,j+l/2,k+l/2 2 3 V— i,j,k Li+l/2,j,k 1 Li,j+l/2,k 1 Li+l/2,j+l/2,k 

,A , A , A , A \ 

l-i,j,k+l/2 l-i+l/2,j,k+l/2 ±-i,j+l/2,k+l/2 ±-i+l/2,j+l/2,k+l/2J ' 

and the derivatives thereon, 
d f l = - f-f l + f l - f l + f l 

x Li+i/2,j+i/2,k+i/2 2 3_1 Aa; ^ i J' fc *+ 1 J. fc *-i,j+l,k l-i+l,j+l,k 

-f l + f 1 - f l + f l ) 

±-i,j,k+l ±-i+l,j,k+l ±-i,j+l,k+l l-i+l,j+l,k+lj ' 
9 *d +1 /2J+l/2,fc +1 /2 aIld 9 */I+l/2^+l/2,fc+l/2 111 aIlal °gy> 

to take the predictor step: 

f 1+1/2 f l _ ^_ A m( f l \fl A 

J_pi+l/2j+l/2,fc+l/2 Li+l/2,j+l/2,k+l/2 2 = ^+l/2,j+l/2,A:+l/2'' tym i-i+l/2,i+l/2,fc+l/2' 

Then, we average again, 

f i+1 /2 _ }_( f I i £ I , , I 

L-pi,j,k ~ 23 Vi.p<-l/2J-l/2,fe-l/2 "r i_-pi+l/2J-l/2,A;-l/2 "r i_pi-l/2,j+l/2,fe-l/2 

+Lp 

i+l/2,j+l/2,fe-l/2 -l/2j-l/2,fc+l/2 l/2,j'-l/2,fe+l/2 

-l/2j+l/2,fc+l/2 i+l/2J+l/2,fc+l/2 I > 

and again calculate the derivatives, 
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a f 1+1/2 - 1 f-f 1 _i_ f 1 

U xJ_Vi,j,k — 23-lAa; V i'pi-l/2J-l/2,fc-l/2 + /pi+l/2j-l/2,fc-l/2 

l/2,j+l/2,fc-l/2 

~~ /•pi-l/2j'-l/2,fc+l/2 + /pi+l/2J-l/2,fc+l/2 
~ Zpi-l/2J+l/2,fc+l/2 + fj?i+l/2,j+l/2,k+l/2) ' 

d,/^ 1 / 2 and fl.jg+V 2 in analogy, 

to take the corrector step: 

Lp&» = ^Zl,-fe := Z! jt k ~ At A m (lvi^ 2 )drnf_ v l ^l 2 ■ (15) 



To integrate the source equation (fUj) we use the pseudo-implicit Heun scheme [p!4| , since it 
is said to be similarly robust in the case of stiff equations as implicit schemes: 

lji,k = lji,j,k + A^(£j,jfe) 

fj^ /n = fj itj ,u + Y (^W + ^:Sr 1)/B )) » 2 < m < 
Here n is the number of iterations. 

There are two standard ways to combine the integration operator V for the principal 
part with the integration operator S for the source terms, the Strang I and the Strang 
II scheme JL5]. They have the disadvantage of consuming additional memory and requiring 
many loops over the grids. By using different schemes for the odd and even time steps, 
namely 



and 



fl+2n—l _ qqj fl+2n—2 
£-i,j,k J—i,j,k 

A+2n = vs A+2n-l 

—i,j,k i— i,j,k 1 v ' 



we avoid these disadvantages and obtain a scheme which is globally second order, although 
the coefficient of the leading term in the discretisation error jumps between odd and even 
steps. 

In the numerical implementation we use three grids to store the variables, the gauge source 
functions, and the intermediate values. Even if we calculated the gauge source functions by 
some global second order procedure, which were of course incompatible with hyperbolicity 
of the system, our scheme would be second order. 

B. The fourth order scheme 

In the 3D test calculations it turned out that to obtain the accuracy which we regard as 
necessary would require more computer resources than available to us. We therefore, and 
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due to the strong recommendations by H.-O. Kreiss, tried the "method of line" to build 
higher order schemes. Since the first results on 3D calculations on the Maxwell equations 
and the SU(2)-Yang-Mills equations confirmed the statement, that the required number of 
gridpoints per coordinate directions is reduced by a factor of five Ul6| , we combined the 
method of line with the conformal approach. 
In the method of line we formally write 

dtl=B{lAD> ( 18 ) 

where B_(f,dif) = —£(f)dif + b(f), and integrate the "ordinary differential equation" flT8|) 
by a scheme to integrate ordinary differential equations, in our case a fourth order Runge- 
Kutta scheme: 

/Ij,k = £,j,k + r (A,k + 2 ^5,fc /4 + 2 ^Sl 2 + l^ijT) ' ( 19 ) 



where 



k l+1/A 

—i,3,k 




rl+1/4 
z—i.j,k 


= f l - 

■> 1,3, 


+ -k l 


&i,i,k 




f 1+1/2 
?—i,j,k 


= f l - 


, 1 A- i+1/4 

k ' 2 — *^' fc 


^+3/4 




£1+3/4: 
£-i,j,k 


J i,3, 


, ^'+1/2 



To carry over the convergence order of the time integration we must calculate the space 
derivatives in the source term with appropriate stencils. Best results are obtained, especially 
if space gradients dominate the error, by pseudo-spectral methods |T7| , which are of infinite 



order in space. Nevertheless, we do not want to use pseudo-spectral methods, because we 
want to keep the freedom to continue the calculation after singular values appeared at some 
gridpoints on a slice. 

Instead of using spectral decompositions to calculate the space derivatives, we approximate 
derivatives by the symmetric fourth order stencil ( |10a| ), which stretches out two gridpoints 
to each side. To ensure stability we can, and actually we have to, add dissipative terms of 
higher order. For systems with constant coefficients theorem 6.7.1 and theorem 6.7.2 of ITS 



show stability. It is also discussed there how to extend results for systems with constant 
coefficients to systems with variable coefficients. 
The dissipation term suggested by the theorems is 



i— l,j,k 



- 20 4,, + 15 /U, fe - 6 £+*,3,k + J ' ( 20 ) 



which has a seven point stencil stretching out three gridpoints on each side. By adding 
a Q2 '■= qIn(Ax) 5 Y<iLi di 6 f with a sufficiently large a to each evaluation of B_(f, dif) we got 
a stable scheme. Numerical experiments showed that large values for a require small time 
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steps At for stability, therefore a should be chosen large enough, but as small as possible. 
The test cases have been run with a = 2, other data may require larger values. 
We also looked at a scheme which adds dissipation from a five point stencil after each full 
fourth order Runke-Kutta step, namely the term aQ 2 = ^^(Ax) 4 J2f=i di 4 f with 

® x Li,j,k ~ (Ax) 4 (— i ~ 2 <i' k ~ ^— ^— i,j,k ~ ^J-i+l,j,k Li+2,j,k) ' 

For cr = 2 this also seems to be stable but it only yields a third order scheme and was 
therefore not pursued any farther. 

In the numerical implementation we use four grids to store the variables, the gauge source 
functions, and the intermediate values. And again, even if we calculated the gauge source 
functions by some global fourth order procedure, the scheme would stay fourth order. 

C. The boundary treatment 

There are two kind of boundaries to be dealt with. The first kind, which we call false 
boundaries, comes from reducing the grid size, but not the grid dimension, by assuming 
discrete symmetries. In our case these are the periodic boundaries in the y and z directions 
in asymptotically A3 spacetimes and the boundaries coinciding with the symmetry planes 
in octant^ mode. On the false boundaries Q may assume any value. The second kind 
of boundaries, which we call true boundaries, restrict the treated range of the conformal 
spacetime. To avoid any significant influence of their treatment onto the physical spacetime 
they must be placed into regions with Q < 0. 

In the first part of this series we have described how we change the equations near the true 
boundaries. For the tests we used the modification (1/19), which freezes the time evolution 
near the true boundaries. 

Before we take a time step, we extend the grid on the true boundaries by the stencil width 
by a first order extrapolation. At false boundaries we extend by copying the values from the 
corresponding gridpoints in the interior. Then we take the time step and, of course, loose 
the just created gridpoints again. 

D. Controlling the time step 

It is well-known that the Courant-Friedrichs-Lewy condition, which states that the nu- 
merical domain of dependence must contain the analytic domain of dependence, is necessary 
for stability of symmetric hyperbolic schemes. This requirement restricts the maximal size 
of the time step. To calculate a linear approximation to its size we take the forward light 
cone at all gridpoints (i, j, k) and calculate the time Atmini j^ of its earliest intersection with 
the boundaries given by the neighbouring gridpoints with coordinate values Xj-i, Xi + \, j/j-i, 



4 Octant mode is obtained when assuming a mirroring symmetry with respect to the x, y, and z 
plane through the origin. 
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Vj+i, Zk-i, and Zk-i- The minimum over the grid At min := mm^fe) At^y^ would be the 
maximally allowed time step for the pure rotated Richtmyer scheme. Stability requirements 
not caused by the CFL-condition and the use of wider stencils modify the allowed time 
step to a multiply q of At min . We ran all calculations reported in this paper with the value 
9 = 1/2. 

The time step control by linearly approximating the light cone has consequences for what 
we can expect, when we check convergence of a scheme by subtracting the results from 
runs with different coarseness of the grids at corresponding time steps. Since the size of 
the time steps deviates by second order, corresponding slices have time coordinates which 
deviate by second order. The sequence of the differences will therefore seem only second 
order convergent, even if the scheme really converges with higher order. 



IV. THE TESTS PERFORMED 

A. The exact solutions used for testing 

To test the code we reproduce exact solutions of the conformal field equations by prescrib- 
ing the data and the gauge source functions from the exact solutions. To better understand 
the test runs we shortly recall those properties of the used exact solutions which are impor- 
tant for us. 

The first class of exact solutions, which we use in the 2D and 3D test runs, are the so-called 
asymptotically A3 solutions |Tj|. They are given by 

2 X 



9ab 



4 eM (~^ 2 + dx ^ + \ ^ 2 + x ^ ( eWd y* + e ~ Wdz ° 

7 {t s %s ) 



(22) 



where M and W are certain functions of t s and x s . 
If we set 



{M,W) = (0,0), 



we obtain the A3 solution. 
The choice 



(t s 2 + x s 2 f t 



(M, W) 



X a 



256 



(23) 



(24) 



yields a solution, which contains gravitational radiation ]2(J . Figure shows a conformal 
diagram of asymptotically A3 solutions with the periodic y and z coordinates suppressed. 
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FIG. 3. Conformal spacetime for 
the asymptotically A3 spacetimes 
with y and z coordinate suppressed 
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FIG. 4. Conformal Minkowski space- 
time with z coordinate suppressed 
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We start at to — — 1 an d integrate until t\ = —1/2. As long as to and ti are negative, their 
choice is completely arbritrary. The origin (0, 0) is singular, the components of the metric 
and the curvature become singular there. In principle we could continue the calculation 
beyond ti towards (t s ,x s ) = (0,0), and for test purposes we have done so. The CFL 
condition then forces us to use smaller and smaller time steps and we approach, but never 
reach, the origin. 

In the calculations reported here we have hidden the symmetries in the 2D calculations by 
the transformation 



/ 1, \ 


f f 


x s = 


X < 







t 

l + ~(x-l) [sin (n y ~ Vmin 
V 

and in the 3D calculations by the transformation 
( 

1) sin ( 



(25) 







X g 








\Zs) 





X i 1 + 2 [X - 



t 



\ 



2/max Vmu 

y 

z 



2 r 



sm 7T 



(-K Z Zmiu ) 

\ 2m ax -2min / 



(26) 



/ 



Another exact solution, we have tested against, is the conformal representation of Minkowski 
spacetime, which is in spherical coordinates (t, r, 8, (p) given by 




( - dt 2 + dr 2 + (sin r) 2 [d9 2 + (sinO) 2 'dip 2 } \ 
2 



\ 




tan 



t+r 



1 



tan 



t-r 



(27) 



/ 



Of course, in the actual calculation we have transformed to Cartesian coordinates, where 
g a b is no longer diagonal and the spherical symmetry is hidden. It should also be mentioned 
that this conformal solution is time dependent, since the conformal factor Q is. But the time 
dependence is in a certain sense very weak, already the test calculations for the spherically 
symmetric case in [21 have shown, that the accuracy obtained is order of magnitudes better 
than what is obtained in interesting cases. This statement is confirmed by the results of 
subsection [L V Df 

Due to our boundary treatment we can expect the reproduction of the exact solution and 
the propagation of the constraints only at gridpoints representing physical spacetime and 
J , although the solutions given exist outside the physical region. Therefore, we apply 
the measures for the quality of a solution, as defined in subsection [IV B| , only at the grid- 
points representing physical spacetime. In the used representation of conformal Minkowski 
spacetime we reach timelike infinity i + after a finite conformal time. Hence, the number 
of gridpoints representing physical spacetime decreases to zero. To have a significant part 
of the grid available to evaluate our accuracy measures we compare on the slice half way 
to i + , although the calculations have been continued beyond i + . Even when surpassing i + 
there is no change in the convergence of the scheme. By being able to cover timelike infinity 
with gridpoints we have a powerful numerical tool to study the fall-off behaviour of radiative 
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quantities for very long times flOf 



Another solution which is commonly used for testing codes in numerical relativity is the 
Schwarzschild solution. So, why is this code not yet tested against the Schwarzschild solu- 
tion? The reason is simple: The author is not aware of any explicit solutions of the conformal 
field equations representing the Schwarzschild-Kruskal spacetime, which is regular at both 
Ss and which is smoothly extendible across both Ss. By Birkhoff 's theorem, the theorems 
in [Q, and the theorems in we know that all the requirements above can be achieved, 
except that the solution is not explicit. When describing the initial data solver part of the 
code, we are going to prescribe spherically symmetric free functions with two Ss, which then 
necessarily yield data for the Schwarzschild-Kruskal spacetime. 



B. Measures for the quality of a numerical solution 

Without doubt, the best measure for the quality of a numerical solution is the difference 
to the exact solution: 

^ I,,,--/..,,,,:.- (28) 

where {i,j,k} denotes gridpoints representing physical spacetime or null infinity. Since 
gravitational radiation is given by the Newman-Penrose quantity ip4 of the conformal Weyl 
tensor d abc d evaluated at J ' , which is essentially a polynomial expression in E ab and B ab , the 
error in E ab and B ab is also a measure for the error in the gravitational radiation. As we 
do not prescribe all variables when providing an exact solution, but numerically calculate 
(1,1) R ab , E ab , and B ab by solving elliptic equations, the comparison of all variables with the 
exact solution is very time consuming, especially since the AMG library does not run in 
parallel. We therefore compare with the exact solution only once towards the end of a 
calculation. On the intermediate time steps we use what we call pseudo-difference, that 
is we calculate (h ab , k ab , ^ a bc , (0 ' 1) ^ a , /(i,i)£ a &, ^fEab, fBab, ^o, ^a, u) from the exact 
solution given by (g ab , Q) and compare with (h ab , k ab , ^ a bc , (0,1) #a, ^ {1,1) Rab, & 2 E ab , ^B ab , 
Q, Qq, Q a , to). Comparisons of the measure "difference" and "pseudo-difference" have been 
performed and the comparisons showed that the measures are measures exchangeable with 
respect to the relative error, which we define as 



f - f 

AJ—ij^k -i-exact i,j,k (nc\\ 
— reli,j,k : — 7TT7 IV v y J 

max ( i )i>fc )(l, l/^J) 



With respect to the absolute error the results differ, since e. g. Q 2 E ab , which is used in 
pseudo-difference, may be significantly smaller than E ab , and since the variables represent- 
ing second derivatives of the metric, like E ab , tend to dominate the absolute error. 

We also monitor the violation of the constraints. In later runs, which do not reproduce 
an exact solution, the convergence of the violation of the constraints to zero is our main 
criteria for consistency of the numerical time evolution with the Einstein equation, although 
we cannot conclude from the size of the numerical constraint violation to the quality of the 



numerical solution 22 . 



14 



When debugging and analysing the performance of the code we look at the measures point- 
wise by looking at surface plots on well-chosen two-dimensional slices. Although this implies 
looking at hundreds of plots to get representative samples, we regard it as unavoidable for 
a thorough testing and tuning of the code. Since we cannot present hundreds of figures in 
a paper we must condense for the presentation in this paper. To do so, we plot 



|A|IW:=-. 



1 ^ f \ 2 1 

— a / (Ai(x,y,z)) dydz~ — 



M 



J2 Yj (^HJ,k) ' 



(30) 



where M is the number of the entries in the vector A = A; of quantities which should go to 
zero. 

C. Tests of the 2D code 



In the first tests we ran we did not hide the obvious Killing vector d Vs in expression (g2j). 
The only remarkable thing to report about is, that in this case we could find a stable 
treatment of the true boundaries without changing the equations near the true boundaries 



as described in subsection [III C\ For solutions with hidden Killing vectors the first became 
unstable, forcing us to introduce the later treatment. 

From the numerical viewpoint, solution (^), the A3 solution without gravitational radiation, 
and solution (0), a solution with gravitational radiation, behave very similar. 

Figure |5| shows the convergence of the second order scheme in the measure (|30| ) , where A; 
denotes the pseudo-difference to the exact solution at t\ = —1/2. 




The lines plotted correspond to calculations with 40 2 (solid), 80 2 (dashed), 160 2 (dotted), 
320 2 (dotted), and 640 2 gridpoints (dotted). The values are scaled in such a way that the 
lines would coincide, if the convergence were exactly second order. The gridpoint numbers 
are with respect to the output grid which has a constant number of gridpoints independent 
of the grid on which the calculation is performed. Obviously in the calculation with 40 2 
gridpoints higher order terms still make a significant contribution to the error, the solid 
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line does not coincide with the dashed line. For finer grids we get the expected rate of 
convergence. 

Since the size of the error measure is not immediately related to the maximum norm of the 
error, we give it as well: The maximum of the absolute error drops in good agreement with 
the convergence rate from 4.44 in the 40 2 run to 0.01 in the 640 2 run. The variable with 
the largest error is in both cases En. The variable with the largest relative error is kn, its 
value drops, again in good agreement with the convergence rate, from 17% to 0.07%. 
Figure |7| shows the scaled measures for the violation of the constraints. 
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FIG. 7. Convergence of the violation of 
the constraints in an A3 like solution for 
the 2nd order scheme in 2D 
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FIG. 8. Convergence of the violation of 
the constraints in an A3 like solution for 
the 4th order scheme in 2D 



Since all the curves almost coincide, the violation of the constraints converges to zero in 
excellent agreement with the convergence order. 

Figure ^| is the analogue to figure |5| for the fourth order scheme, but with a fourth order 
scaling. We observe that already for the 40 2 run the error is dominated by fourth order 
contributions. Already for this coarsest grid, the error measure is one order of magnitude 
smaller than what we obtained for the second order scheme. This result is also confirmed by 
the figures for the maximum norm, which are: The maximal absolute error drops from 1.35 
to 2.1 x 10 -5 and is found as in the second order scheme in the variable En. In contrast 
to the second order scheme, the maximal relative error appears for the variable £12 and 
decreases from 6.2% to 1.0 x 10 _4 %. This is all in excellent agreement with fourth order 
convergence. 

Good, but not excellent agreement with fourth order convergence can be found in figure |8|, 
which shows the convergence of the violation of the constraints. The scaled curves for the 
40 2 (solid line) and the 80 2 (dashed line) deviate from the curves for runs with finer grids 
(160 2 and 320 2 ), which coincide very well. The dashed-dotted line representing the 640 2 
run deviates slightly around gridpoint 2 and 18 from the 160 2 and the 320 2 runs. This is 
not serious, in this run we have reached the lower accuracy bound at which rounding errors 
inherited from the "f2 divider" become significant (see also the discussion at the end of 
subsection [11 C| ). 
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D. For 3D 



1. Minkowski 



Figure |9| and figure [10] show the measure for the pseudo-difference for a second and a 
fourth order 3D run with 50 3 (solid line) and 100 3 (dashed line) with data for the Minkowski 
spacetime. 
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FIG. 9. Convergence against the Min- 
kowski solution for the 2nd order scheme 
in 3D 
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FIG. 10. Convergence against the Min- 
kowski solution for the 4th order scheme 
in 3D 



The solid line is scaled such that with exact convergence it would lie on the dashed line. 
Although the runs have been done on a full grid, despite the octant symmetry, the figures 
are based on an octant. We immediately see, that, due to the low number of gridpoints, 
the error of both schemes has contributions from higher order terms. These higher order 
contributions are more significant in the second order run. We do not regard this failure as 
very serious as the achieved accuracy is already extremely high. 

In the 100 3 run the maximum of the absolute error is extremely small, it is 6.6 x 10 -5 for 
(M) i?33 in the second order run and 2.2 x 10~ 8 for (1,1) i?33 in the fourth order run. The exactly 
same numbers hold for what we have defined as the maximal relative error. 
If we calculate under the assumption of second order convergence, what grid size we would 
need to achieve in a second order run the same accuracy with respect to the maximum of 
the relative error as in the 100 3 fourth order run, we get a hypothetical 5400 3 grid. Due 
to the already very small error in both schemes we regard that estimate as meaningless for 
practical purposes. 



2. A3 



Figure |TT] and figure [12] show the measure for the pseudo-difference for a second and a 
fourth order 3D run with 50 3 (solid line) and 100 3 (dashed line) on an A3 like spacetime 
with hidden y and z Killing symmetries. 
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FIG. 11. Convergence against an A3 like 
solution for the 2nd order scheme in 3D 



3.0x10 




Gridpoint number 



FIG. 12. Convergence against an A3 like 
solution for the 4th order scheme in 3D 



Again the solid line is scaled such that with exact convergence it would lie on the dashed 
line and we only plot an octant. The order of convergence is in excellent agreement with 
the prediction, the contribution of higher order terms to the error is almost negligible. 
With the 100 3 grid the maximum of the absolute error is 0.41 for En in the second order 
run and 0.077 for E n in the fourth order run. The largest relative error is 5.8% for B n in 
the second order run and 0.24% for E23 in the fourth order run. 

If we again calculate under the assumption of second order convergence, what grid size we 
would need to achieve in a second order run the same accuracy with respect to the maximum 
of the relative error as in the 100 3 fourth order run, we get a hypothetical 490 3 grid. With 
respect to the L 2 -Norm of the pseudo-difference we would even need 1000 3 gridpoints. 



V. CONCLUSION 

In this paper we have described a method and its numerical implementation to derive a 
complete set of data for the conformal field equations from a minimal set which works for 
all scenarios described in subsection III/B of |L|. 

Using data derived from a minimal set given by exact solutions we have tested our time 
integration code and compared a second with a fourth order scheme. This comparison 
turned out to be disastrous for the second order scheme. To reproduce the accuracy of a 
maximal relative error smaller than one percent in a situation with gravitational radiation, 
as achieved by a 100 3 grid in the fourth order run, which needs 1.7 GB of memory and 
7 hours on 16 processors of a origin 2000, we would need more than 200 GB memory and 
3000 hours on 16 processors for a run with the second order scheme. 
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APPENDIX A: REMARKS ABOUT THE COMPUTATIONAL SCIENCE 

ASPECT OF THE CODE 

We now give a short review of the computational science aspect of the code. 
To be able to do calculations of spacetimes with low symmetries, especially 3D calculations, 
it is highly recommendable to do the resource intensive parts of the calculations in parallel. 
On the other side, development of parallel program code tends to take much longer than 
writing serial code. It is therefore advantageous to be able to execute serial as well as parallel 
sections in one program, one then has the freedom to program less resource intensive tasks 
serial and to save human resources. The most used library for parallel computing, MPI, 
does not allow to combine parallel and serial code, since the library does not require the 
processors to share a common address space for their memory. We therefore decided to 
require shared memory, a modern hardware technology, which provides a common address 
space, and to use POSIX threads to achieve parallelism. 

In 2D and 3D time evolutions of the conformal field equations the code scales very well 
and on a 27 processor^ run we typically get a speed-up of 24. The remaining gap is mostly 
due to the variable work load per gridpoint caused by the change of the equations near 
the boundary as described in equation (1/19). For systems with a constant work load per 
gridpoint, e. g. 3D Maxwell equations on periodic grids, we come significantly closer to the 
optimal speed-up of 27. 

With the exception of dumping intermediate results to checkpoint, all output of data is done 
in the XDR format to generate hardware independent binaries. The output interface allows 
the caller to request and get the output of arbritrary rectangular sections of the grid with 
arbritrary coarseness for each coordinate. 

The interaction between the parts of the code which provide the computational infrastructure 
and the parts which are equation or boundary specific has been minimised. The code was 
successfully used to numerically solve initial value problems for the 3D Maxwell equations 
on periodic grids, the 3D SU(2) Yang-Mills equations on periodic grids, and the ID, 2D, 
and 3D conformal field equations for boundaries periodic in y and z direction, for normal 
boundaries in all directions, and for octant mode. 



In 2~3] we describe why we believe that a numerical relativity code should be able to deal 
with variables becoming singular at gridpoints. Since we want to also implement a similar 
strategy in more than ID later, the code is designed to allow a future extension to include 
handling and bookkeeping of singular gridpoints. Since the implementation of this extension 
was not necessary for the purposes of this paper, it is due. Its description will be the last 
part of this series. 



The maximal number of processors available for a single user on our computer. 
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